ON TWO-SIDED CONFIDENCE AND TOLERANCE LIMITS
FOR NORMAL DISTRIBUTIONS

I.  INTRODUCTION 

In many cases of statistical inference it is more
meaningful and informative to construct confidence intervals
for parameters under investigation rather than to make tests
of hypotheses. This requires some understanding of the concept
of confidence intervals. Coupled with the understanding of
confidence intervals is the understanding of tolerance limits.
Frequently one finds that confidence limits are used when
tolerance limits should be used, or confidence limits are
computed with the general interpretation of tolerance limits.

In this report confidence limits and two types of
tolerance limits are described for normal distributions,
together with some theorems on which the concept and
construction of these limits are based. Differences and
similarities between the three types of limits are pointed
out. Procedures are presented for computing two-sided
confidence and tolerance limits for means and for simple
linear regression data (simultaneous and non-simultaneous
limits for each type).

For comparative purposes, the six different types of limits
are computed on a numerical regression problem.

Finally, an additional bibliography is included for
reference on confidence and tolerance limits when information
other than what is given in the paper is desired.
II. CONFIDENCE LIMITS

Suppose a random sample of n observations
$(Y_1, Y_2, \ldots, Y_n)$ is drawn from a normal population in
an attempt to obtain some information about the mean of the
population, $\mu$.

A point estimate of the parameter $\mu$ is the sample mean,
$\bar{Y}$. Although the estimate is unbiased it is not very
meaningful without some measure of the possible error. Thus,
frequently one determines an upper and a lower limit, or a
confidence interval, which is rather certain to contain $\mu$.

The general method of construction of confidence
limits is as follows (A). Suppose one has a family of
populations each with a known density function $p(y:\varphi)$,
$y$ being the random variable and $\varphi$ the parameter in
question. Suppose one has an estimator $g$ to estimate
$\varphi$, where $g$ is a function of the observed $y$, and
suppose that one can derive the density function of $g$,
$p(g:\varphi)$. Now if one assumes that $\varphi$ equals some
particular value, say $\varphi'$, then this value can be
inserted and the density function $p(g:\varphi')$, the
distribution of $g$ under this assumption, can be obtained.

Under the assumption $\varphi = \varphi'$, there will be a
$P_1$ point for the distribution of $g$, say $g_1$, which will
be determined by

$$\Pr[g < g_1 : \varphi = \varphi'] = \int_{-\infty}^{g_1} p(g:\varphi')\,dg = P_1.$$

Likewise, under the same assumption there will be a $P_2$ point
for the distribution of $g$, say $g_2$, determined by

$$\Pr[g > g_2 : \varphi = \varphi'] = \int_{g_2}^{\infty} p(g:\varphi')\,dg = 1 - P_2. \qquad (2.1)$$

The area under the density function below $g_2$ is equal to
$P_2$, and the area between $g_1$ and $g_2$ is then equal to
$(P_2 - P_1) = \gamma$, say.


Now, if the value of $\varphi'$ is changed, the corresponding
values of $g_1$ and $g_2$ are changed. Therefore $g_1$ and
$g_2$ can be regarded as functions of $\varphi$, say
$g_1(\varphi)$ and $g_2(\varphi)$, respectively. In principle,
one can plot these functions $g_1(\varphi)$ and $g_2(\varphi)$
against $\varphi$ (see Figure 1).

Now assume that the true value of $\varphi$ is actually
$\varphi_0$. Then $g_1(\varphi)$ and $g_2(\varphi)$ take the
values $g_1(\varphi_0)$ and $g_2(\varphi_0)$, respectively, and
$\Pr[g < g_1(\varphi_0)] = P_1$, $\Pr[g > g_2(\varphi_0)] = 1 - P_2$,
which imply

$$\Pr[g_1(\varphi_0) < g < g_2(\varphi_0)] = P_2 - P_1 = \gamma. \qquad (2.2)$$

Now suppose that a sample observation was taken and that a
numerical value of the estimate, say $g_0$, was computed. Then,
in Figure 1, a horizontal line can be drawn parallel to the
$\varphi$ axis through the point $g_0$ on the $g$ axis. Let
this line intercept the two curves $g_2(\varphi)$ and
$g_1(\varphi)$ at points A and B. Project the points A and B on
to the $\varphi$ axis to give $\underline{\varphi}$ and
$\bar{\varphi}$. One asserts that a $(P_2 - P_1)$ confidence
interval for $\varphi$ is $(\underline{\varphi}, \bar{\varphi})$,
i.e.

$$\Pr[\underline{\varphi} < \varphi < \bar{\varphi}] = P_2 - P_1 = \gamma. \qquad (2.3)$$

The justification for this assertion is as follows. Enter
the true value $\varphi_0$ on the $\varphi$ axis; erect the
perpendicular at this point to cut the curve $g_1(\varphi)$ at
C and $g_2(\varphi)$ at D. At both these points $\varphi$ has
the value $\varphi_0$; so, at C, $g = g_1(\varphi_0)$, and, at
D, $g = g_2(\varphi_0)$. The horizontal lines through C and D
will intersect the $g$ axis at $g_1(\varphi_0)$ and
$g_2(\varphi_0)$, respectively. Now $\varphi_0$ may be anywhere
on the $\varphi$ axis, but if AB intersects CD, then $g_0$ must
lie in the interval $(g_1(\varphi_0), g_2(\varphi_0))$ and
simultaneously the interval $(\underline{\varphi}, \bar{\varphi})$
must include $\varphi_0$. In other words, the two statements

(i) $g_0$ lies in the interval $(g_1(\varphi_0), g_2(\varphi_0))$,

and

(ii) the interval $(\underline{\varphi}, \bar{\varphi})$ includes $\varphi_0$,

are always true simultaneously or not true simultaneously.

But by (2.2) the event (i) has probability $(P_2 - P_1)$, so
the event (ii) must also have probability $(P_2 - P_1)$. Hence
one can write

$$\Pr[\underline{\varphi} < \varphi_0 < \bar{\varphi}] = P_2 - P_1 = \gamma$$

and this completes the justification of (2.3).

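At the point A, the function $g_2(\varphi)$ has
$\varphi = \underline{\varphi}$ and takes on the value $g_0$,
i.e., $g_2(\underline{\varphi}) = g_0$. Now $g_2(\varphi)$ was
defined as the solution of (2.1), so one can use this equation
to find $\underline{\varphi}$; $\underline{\varphi}$ is
obtained by solving

$$\int_{g_0}^{\infty} p(g:\underline{\varphi})\,dg = 1 - P_2 = \Pr[g > g_0 : \varphi = \underline{\varphi}].$$

Similarly, at the point B, the function $g_1(\varphi)$ has
$\varphi = \bar{\varphi}$ and takes the value $g_0$; so
$g_1(\bar{\varphi}) = g_0$, and $\bar{\varphi}$ can be found as
the solution of

$$\int_{-\infty}^{g_0} p(g:\bar{\varphi})\,dg = P_1 = \Pr[g < g_0 : \varphi = \bar{\varphi}].$$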


To determine, for instance, confidence intervals for
the population mean $\mu$, one must seek a random variable
whose distribution is known and which depends only on $\mu$
and the sample random variables, with no other unknown
parameters. For the normally distributed variable with
$\sigma$ unknown the quantity

$$t = \frac{(\bar{Y} - \mu)\sqrt{n}}{s}$$

is such a random variable having Student's t distribution
with n-1 degrees of freedom (df), where

$$s^2 = \frac{n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}{n(n-1)},$$

$s^2$ being an unbiased estimate of $\sigma^2$.
Before proceeding with the derivation of the confidence
interval, we shall recall the definition of Student's t
distribution (5). A random variable has Student's t
distribution with n-1 df if it has the same distribution as the
quotient $(u\sqrt{n-1})/v$, where $u$ and $v$ are independent
random variables, $u$ having a normal distribution with mean 0
and standard deviation 1, and $v^2$ having a chi-square
($\chi^2$) distribution with n-1 df. More precisely,
$((\bar{Y}-\mu)\sqrt{n})/\sigma$ is normally distributed with
mean 0 and variance 1, and $s^2/\sigma^2$ is distributed
(independently) as $\chi^2/(n-1)$ with n-1 df.

From tables of the Student's t distribution one
determines two percentiles, $t_{(1-\gamma)/2,\,n-1}$ and
$t_{(1+\gamma)/2,\,n-1}$, such that*

$$\Pr[t_{(1-\gamma)/2,\,n-1} < t < t_{(1+\gamma)/2,\,n-1}] = \int_{t_{(1-\gamma)/2,\,n-1}}^{t_{(1+\gamma)/2,\,n-1}} p(t)\,dt = \gamma$$

where $n \ge 2$.

*In hypothesis testing one rejects the hypothesis that
$\mu = \mu_0$ if t falls outside this interval, where the
alternate hypothesis is that $\mu \ne \mu_0$. This represents
a test of size $1-\gamma$.
Or, more precisely,

$$\Pr\left[t_{(1-\gamma)/2,\,n-1} < \frac{(\bar{Y}-\mu)\sqrt{n}}{s} < t_{(1+\gamma)/2,\,n-1}\right] = \gamma.$$

This inequality is then converted to

$$\Pr\left[\bar{Y} - t_{(1+\gamma)/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{Y} - t_{(1-\gamma)/2,\,n-1}\frac{s}{\sqrt{n}}\right] = \gamma. \qquad (2.4)$$

This interval, a confidence interval for $\mu$, is given in
most standard statistical texts (16). Owing to the fact that
Student's t distribution is symmetric,
$t_{(1-\gamma)/2,\,n-1} = -t_{(1+\gamma)/2,\,n-1}$. This fact
will be used throughout the remainder of the paper.

For the case where $\sigma$ is known one can use (2.4) for
the computation of the confidence interval by simply
replacing s by $\sigma$ and using, for df = $\infty$,
$t_{(1+\gamma)/2,\,\infty} = Z_{(1+\gamma)/2}$, the
$(1+\gamma)/2$ normal deviate, since Student's t distribution
approaches the normal distribution for large degrees of
freedom.
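
As a minimal sketch of the interval (2.4) and of its known-sigma variant, the following Python fragment may help. The data, the function names, and the value sigma = 0.4 are hypothetical illustrations and not part of the report; the percentile 2.262 is the usual tabled value of t for 0.975 and 9 df, entered by hand because the Python standard library has no Student's t tables.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_limits(ys, t_crit):
    """Limits Ybar -/+ t_crit * s / sqrt(n), following (2.4).

    t_crit is the tabled percentile t_{(1+gamma)/2, n-1}, supplied by the caller.
    """
    n = len(ys)
    ybar = mean(ys)
    s = stdev(ys)  # square root of the unbiased variance estimate s^2
    half = t_crit * s / math.sqrt(n)
    return ybar - half, ybar + half


def mean_confidence_limits_known_sigma(ys, sigma, gamma):
    """Sigma known: s is replaced by sigma and t by the normal deviate Z_{(1+gamma)/2}."""
    n = len(ys)
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    half = z * sigma / math.sqrt(n)
    return mean(ys) - half, mean(ys) + half


# Hypothetical sample of n = 10 observations.
sample = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 9.7]
lo_t, hi_t = mean_confidence_limits(sample, t_crit=2.262)
lo_z, hi_z = mean_confidence_limits_known_sigma(sample, sigma=0.4, gamma=0.95)
```

Both intervals are centered at the sample mean; only the multiplier and the scale estimate differ.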

The interpretation of confidence limits is as follows.
If many samples of size n were drawn from the same population
and $100\gamma\%$ upper and lower limits were determined from
each sample, then one would expect $100\gamma\%$ of these
"random intervals" to cover the point $\mu$. Or, if an
experimenter asserts a priori that an interval includes the
parameter $\mu$, he should be making a correct statement
$100\gamma\%$ of the time. In practice, one usually has only
one sample from which to determine an interval estimate.
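
The repeated-sampling interpretation can be checked with a small simulation. The sketch below is only illustrative: it assumes the known-sigma form of the interval (so that no t table is needed), and the population values, seed, and function name are arbitrary choices rather than anything taken from the report.

```python
import math
import random
from statistics import NormalDist, mean


def coverage_rate(mu, sigma, n, gamma, n_samples, seed=0):
    """Fraction of simulated known-sigma intervals that cover the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    half = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(n_samples):
        ybar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if ybar - half <= mu <= ybar + half:
            hits += 1
    return hits / n_samples


# With gamma = 0.95 the observed fraction should be close to 0.95.
rate = coverage_rate(mu=50.0, sigma=4.0, n=10, gamma=0.95, n_samples=4000)
```

Any single interval either covers the mean or does not; the confidence coefficient describes the long-run fraction over many such intervals.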

One should remember in the above discussion and
throughout the rest of the paper that upper and lower
limits are computed but that frequently it is more
convenient to speak of the interval formed by the limits.

Moment generating functions may be used to show that
a linearly transformed normal random variable is normally
distributed and that any linear combination of independent
normal random variables has a normal distribution (5). The
following general procedure (Procedure A) may then be used
for the computation of confidence limits on any parameter
or linear function of parameters $\varphi$ from normal
populations [e.g. $\varphi = \mu$, $\varphi = \mu_1 - \mu_2$,
or $\varphi = \beta$*]:

Procedure A

1. Obtain an estimator g of $\varphi$
   e.g. $g = \bar{Y}$, $g = \bar{Y}_1 - \bar{Y}_2$, or $g = b$**

*population regression coefficient

** $b = \dfrac{\sum X_i Y_i - (\sum X_i)(\sum Y_i)/n}{\sum X_i^2 - (\sum X_i)^2/n} = \dfrac{Sxy}{Sx^2}$, where $\sum = \sum_{i=1}^{n}$

2. Obtain the variance of g and write it in the form
   $\sigma^2/n'$*
   e.g. $\mathrm{var}\,\bar{Y} = \sigma^2/n$,
   $\mathrm{var}(\bar{Y}_1 - \bar{Y}_2) = \left(\frac{1}{n_1} + \frac{1}{n_2}\right)\sigma^2$**,
   or $\mathrm{var}(b) = \sigma^2/Sx^2$

3. Obtain an unbiased estimate of $\sigma^2$ (usually called $s^2$)
   e.g. $s^2 = \dfrac{\sum Y_i^2 - (\sum Y_i)^2/n}{n-1} = \dfrac{Sy^2}{n-1}$,
   $s^2 = \dfrac{Sy_1^2 + Sy_2^2}{n_1 + n_2 - 2}$,
   or $s^2 = \dfrac{Sy^2 - (Sxy)^2/Sx^2}{n-2}$

4. Confidence interval estimate for $\varphi$:
   $g \pm t_{(1+\gamma)/2,\,f}\; s/\sqrt{n'}$ ***
   where $t_{(1+\gamma)/2,\,f}$ is the $(1+\gamma)/2$ percentage
   point of Student's t distribution with f df (in the
   examples f = n-1, $n_1+n_2-2$, or n-2, respectively)

*The use of n' will be explained in the section on tolerance
limits.

**Assuming that both populations have a common $\sigma^2$.

***Remember $t_{(1-\gamma)/2,\,f} = -t_{(1+\gamma)/2,\,f}$.
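
The four steps of Procedure A can be sketched for the regression-coefficient example, where g = b, n' = Sx^2 and f = n-2. The data below are hypothetical, the function name is an assumption made for illustration, and the percentile 2.571 is the usual tabled t value for 0.975 and 5 df, entered by hand.

```python
import math


def slope_confidence_limits(xs, ys, t_crit):
    """Procedure A applied to the slope of a simple linear regression.

    t_crit is the tabled percentile t_{(1+gamma)/2, n-2}.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sx2 = sum(x * x for x in xs) - sum_x ** 2 / n
    sy2 = sum(y * y for y in ys) - sum_y ** 2 / n

    b = sxy / sx2                          # step 1: estimator g = b
    n_prime = sx2                          # step 2: var(b) = sigma^2 / Sx^2, so n' = Sx^2
    s2 = (sy2 - sxy ** 2 / sx2) / (n - 2)  # step 3: unbiased estimate of sigma^2
    half = t_crit * math.sqrt(s2 / n_prime)  # step 4: t * s / sqrt(n')
    return b, b - half, b + half


# Hypothetical data with n = 7, so f = 5 and t_{0.975, 5} = 2.571.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]
b, lower, upper = slope_confidence_limits(xs, ys, t_crit=2.571)
```

The same skeleton serves the other examples once g, n' and the estimate of the variance are swapped in.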


III.  TOLERANCE  LIMITS 


A.  General  Meaning  of  Tolerance  Limits 

Suppose a random sample of n observations
$(Y_1, Y_2, \ldots, Y_n)$ is drawn from a normal population
with unknown mean, $\mu$, and unknown variance, $\sigma^2$.
Also suppose the experimenter is not interested in estimating
$\mu$ as a single point, nor is he interested in finding
confidence limits for $\mu$. He is more concerned about
predicting individual future values and would like limits
within which he can say with reasonable assurance that most
of his future values will fall. If he constructed these
limits, which one calls tolerance limits, on his control data
(normal range), then individual values falling outside these
limits could be considered as being "abnormal" with a
reasonable level of confidence.

Before proceeding to the details of two different
types of tolerance limits, the following remarks are made
to give the reader a better understanding of the general
nature of the limits. For the moment, consider a normally
distributed population with a known population mean, $\mu$,
and a known population variance, $\sigma^2$. One finds the
two-sided tolerance limits which include 100P% of the
population as $\mu - Z\sigma$ and $\mu + Z\sigma$ since

$$\int_{\mu - Z\sigma}^{\mu + Z\sigma} p(x)\,dx = P$$

where p(x) represents the density function of the normal
distribution and Z is a numerical value which depends on the
chosen value of P. Since the population parameters are
known, the above statement can be made with 100% confidence,
and one hardly has a statistical problem. For example, one
is 100% confident that the tolerance limits,
$\mu \pm 1.96\sigma$, contain the central 95% of the normal
population.
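
For the known-parameter case the multiplier Z follows directly from the normal distribution. A short sketch, with a hypothetical population (mean 100, standard deviation 15) chosen only for illustration:

```python
from statistics import NormalDist


def population_tolerance_limits(mu, sigma, p):
    """Limits mu -/+ Z*sigma enclosing the central proportion p of a known normal population."""
    z = NormalDist().inv_cdf((1 + p) / 2)  # Z depends only on the chosen P
    return mu - z * sigma, mu + z * sigma


lower, upper = population_tolerance_limits(mu=100.0, sigma=15.0, p=0.95)
population = NormalDist(100.0, 15.0)
content = population.cdf(upper) - population.cdf(lower)  # proportion enclosed
```

Here Z comes out as about 1.96 and the enclosed proportion is 0.95, with no sampling uncertainty involved.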

Usually the parameters $\mu$ and $\sigma^2$ are not known,
only the estimates $\bar{Y}$ and $s^2$. If $\mu$ and $\sigma$
are replaced by $\bar{Y}$ and s one would get
$\bar{Y} \pm 1.96s$ as limits in the above example. In
repeated sampling from the same population these limits
would vary about the population tolerance limits,
$\mu \pm 1.96\sigma$, and for some samples the limits would
include less than 95% of the population and for other samples
more than 95%. To be reasonably sure that 100P% of the
population lie between the sample tolerance limits one must
find a value k > Z such that there is a good chance that
$\bar{Y} \pm ks$ will include 100P% of the population.
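
The variation described above can be made visible by simulation. The following sketch draws repeated samples from a standard normal population (an arbitrary choice, as are the seed, sample size, and function name) and records the proportion of the population that each interval Ybar -/+ 1.96 s actually encloses.

```python
import random
from statistics import NormalDist, mean, stdev


def interval_contents(n, k, n_samples, seed=1):
    """Population proportion inside Ybar -/+ k*s for repeated N(0, 1) samples of size n."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    contents = []
    for _ in range(n_samples):
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ybar, s = mean(ys), stdev(ys)
        contents.append(std_normal.cdf(ybar + k * s) - std_normal.cdf(ybar - k * s))
    return contents


contents = interval_contents(n=10, k=1.96, n_samples=2000)
average_content = mean(contents)
share_below_95 = sum(c < 0.95 for c in contents) / len(contents)
```

For n = 10 the average enclosed proportion falls noticeably short of 0.95 and more than half of the intervals enclose less than 95%, which is why a factor k larger than Z is needed.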

Two types of tolerance limits will be discussed:
tolerance limits without confidence probability [(P)TL], and
tolerance limits with confidence probability [($\gamma$,P)TL].

B. Tolerance Limits Without Confidence Probability [(P)TL]

The problem here is to determine k so that for repeated
samples of size n the average proportion in
$\bar{Y}_i \pm k s_i$ (i = 1, 2, ...) is equal to P. Wilks (20)
first determined such a k, but the proof given in this paper
is the proof by I. R. Savage found in an article by
Proschan (14).

Let us consider as tolerance limits the quantities
$\bar{Y} + k_1 s$ and $\bar{Y} + k_2 s$ (two-sided limits). The
proportion P' of the normal population between these limits is

$$P' = \frac{1}{\sigma\sqrt{2\pi}} \int_{\bar{Y}+k_1 s}^{\bar{Y}+k_2 s} e^{-(y-\mu)^2/2\sigma^2}\,dy.$$

We wish to determine k so that E(P') = P, where

$$E(P') = \int_0^{\infty}\!\!\int_{-\infty}^{\infty} P'\, f(\bar{Y},s)\,d\bar{Y}\,ds$$

and $f(\bar{Y},s)$ is the distribution of $\bar{Y}$ and s given by

$$f(\bar{Y},s) = \frac{\sqrt{n}\,(n-1)^{(n-1)/2}\, s^{\,n-2}\, e^{-[n(\bar{Y}-\mu)^2 + (n-1)s^2]/2\sigma^2}}{\sqrt{2\pi}\;\sigma^{n}\; 2^{(n-3)/2}\;\Gamma\!\left(\frac{n-1}{2}\right)}.$$

Using the linear transformation, $Z = (\bar{Y}-\mu)/\sigma$,
E(P') can be written as
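$$E(P') = c \int_0^{\infty}\!\!\int_{-\infty}^{\infty} \left[\int_{Z+k_1 s}^{Z+k_2 s} e^{-u^2/2}\,du\right] s^{\,n-2}\, e^{-[nZ^2 + (n-1)s^2]/2}\,dZ\,ds,$$

where s is now measured in units of $\sigma$ and c collects the
constants. Differentiating with respect to the upper limit k
and integrating out Z gives

$$\frac{dE(P')}{dk} = c \int_0^{\infty}\!\!\int_{-\infty}^{\infty} e^{-(Z+ks)^2/2}\, s^{\,n-1}\, e^{-[nZ^2 + (n-1)s^2]/2}\,dZ\,ds
= c' \int_0^{\infty} s^{\,n-1}\, e^{-[n-1+k^2 n/(n+1)]\,s^2/2}\,ds.$$

Let $y = s^2$. Then

$$\frac{dE(P')}{dk} = \frac{c'}{2} \int_0^{\infty} y^{(n-2)/2}\, e^{-y\,[n-1+k^2 n/(n+1)]/2}\,dy
= \frac{c_1}{[\,n-1+k^2 n/(n+1)\,]^{n/2}}.$$

Hence

$$E(P') = c_1 \int_{k_1}^{k_2} \frac{dk}{[\,n-1+k^2 n/(n+1)\,]^{n/2}}$$

where $k_1$ and $k_2$ are to be chosen so that the integral is
equal to P. Let

$$t = k\sqrt{\frac{n}{n+1}}$$

so that

$$E(P') = c_2 \int_{t_1}^{t_2} \frac{dt}{(n-1+t^2)^{n/2}}.$$

But the integrand is essentially Student's t density function
with n-1 df, and when $k_1$ and $k_2$ equal $-\infty$ and
$+\infty$, respectively, E(P') = 1. Hence $c_2$ must be
identical to the constant of Student's t distribution. Hence
for E(P') = P it follows that $t_1 = t_{(1-P)/2,\,n-1}$ and
$t_2 = t_{(1+P)/2,\,n-1}$. Since
$t_{(1-P)/2,\,n-1} = -t_{(1+P)/2,\,n-1}$,
$k = \pm\, t_{(1+P)/2,\,n-1}\sqrt{(n+1)/n}$ for tolerance limits
symmetric about $\bar{Y}$.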

The interval estimates

$$\bar{Y}_i \pm t_{(1+P)/2,\,n-1}\sqrt{\frac{n+1}{n}}\; s_i \qquad (3.1)$$

which, on the average, include 100P% of the population are
referred to as tolerance limits without confidence
probability or in this paper simply as (P)TL. Thus, when many
samples of the same size are taken from the population and
a (P)TL is calculated each time (same P), these intervals
will on the average include 100P% of the population. If the
experimenter asserts a priori that an interval estimate
contains 100P% of the population, he stands a good chance that
the interval contains in the neighborhood of 100P%, but his
estimate may include considerably more or considerably less
than the desired 100P%. All one does know is that the
average of many of such interval estimates (expected value)
contains 100P% of the population.
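
The "on the average" property of the (P)TL can be illustrated numerically. The sketch below assumes a standard normal population, n = 10, and P = 0.95; the percentile 2.262 is the usual tabled t value for 0.975 and 9 df, entered by hand, and the function names and seed are arbitrary.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def p_tolerance_limits(ys, t_crit):
    """(P)TL of (3.1): Ybar -/+ t_{(1+P)/2, n-1} * sqrt((n+1)/n) * s."""
    n = len(ys)
    half = t_crit * math.sqrt((n + 1) / n) * stdev(ys)
    return mean(ys) - half, mean(ys) + half


def average_content(n, t_crit, n_samples, seed=2):
    """Average population proportion enclosed by the (P)TL over repeated N(0, 1) samples."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    total = 0.0
    for _ in range(n_samples):
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        lower, upper = p_tolerance_limits(ys, t_crit)
        total += std_normal.cdf(upper) - std_normal.cdf(lower)
    return total / n_samples


avg = average_content(n=10, t_crit=2.262, n_samples=4000)
```

The average enclosed proportion is close to 0.95, although individual intervals enclose noticeably more or less.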

At this point it is not easy to see how one could
generalize the above result in order to compute a (P)TL for
any variate for which there is a normally distributed
estimate of the mean with variance $\sigma^2/n'$ and the
estimate of the variance is independently distributed as
$\sigma^2\chi^2/f$ with f df.

The approach one can use in generalizing the procedure will
be shown in the next section when considering the similarity
between confidence limits and (P)TL (see Section IV.B).

C. Tolerance Limits With Confidence Probability [($\gamma$,P)TL]

For many situations the above tolerance interval
estimate is not too useful without some measure of the
possible error associated with it. Another factor which may
disturb some experimenters about the (P)TL is that per
interval estimate one has little assurance of always
containing 100P% or more of the population. Thus, tolerance
limits with confidence probability came into being. In this
paper these tolerance limits will be referred to as
($\gamma$,P)TL, based on the notation in (8)*.

The problem is to find that value of k in

$$A = \frac{1}{\sigma\sqrt{2\pi}} \int_{g-ks}^{g+ks} e^{-(y-\varphi)^2/2\sigma^2}\,dy$$

such that $\Pr[A > P] = \gamma$. A is the proportion of the
population actually included in a given interval, $\gamma$ is
the required confidence coefficient, and P is the proportion
of the population required to be included within the limits
$g \pm ks$, where g is an estimate of $\varphi$, the mean of
the normal population.

Wald and Wolfowitz (17) have shown how values of
k may be determined to an extremely good approximation when
P and $\gamma$ are specified. They considered only the case in
which a random sample of n is drawn from a single normal
population of unknown mean and unknown variance (f = n-1).
Wallis (18) extended their results to cover any normally
distributed variable for whose mean there is a normally
distributed estimate with variance $\sigma^2/n'$ (Wallis called
it N') and for whose variance there is an estimate
independently distributed as $\sigma^2\chi^2/f$ (f not
necessarily equal to n-1, where n is the sample size for
estimating the mean). The n' is the effective number of
observations; that is, the number which, when divided into
the variance of an observation, gives the variance of the
statistic.

*In (8), at least a proportion P of the population is asserted
to lie within the tolerance limits with confidence
probability $\gamma$. This notation was used in (17) and may be
encountered in other texts or articles.

Wallis summarized the Wald-Wolfowitz derivation of
tolerance factors without assuming any connection between n'
and f, and the following is based on his summary.

Given a statistic g having the following characteristics:

(i) It is normally distributed

(ii) Its expected value $\varphi$ is the mean of a
normal population with unknown variance $\sigma^2$

(iii) It has variance equal to $\sigma^2/n'$, where n' is
known, and an independent estimate $s^2$ of $\sigma^2$ is
distributed as $\sigma^2\chi^2/f$ with f degrees of freedom.

The distribution of A above is clearly independent
of $\varphi$ and $\sigma$, since $\varphi$ merely determines
the point about which g will be distributed and the variance
of s is proportional to $\sigma^2$, so without loss of
generality take $\varphi = 0$ and $\sigma = 1$ in the further
computation.

$\Pr[A > P]$ depends on P, k, n' and f. To emphasize
the dependence on P and k for given n' and f, let
$F(P,k) = \Pr(A > P)$. Also, denote the conditional probability
of A exceeding P for a particular value of g by $F(P,k\,|\,g)$,
i.e. $F(P,k\,|\,g) = \Pr[A > P \,|\, g]$.

If $F(P,k\,|\,g)$ is known, then F(P,k) may be found by
forming the product

$$F(P,k\,|\,g)\,\sqrt{\frac{n'}{2\pi}}\; e^{-n'g^2/2}\,dg,$$

which represents the probability that g will lie in an
interval of length dg and that A will exceed P for given g.
If one integrates out g, the result is also equal to the
expectation of $F(P,k\,|\,g)$ as follows:

$$F(P,k) = \sqrt{\frac{n'}{2\pi}} \int_{-\infty}^{\infty} F(P,k\,|\,g)\, e^{-n'g^2/2}\,dg = E_g F(P,k\,|\,g).$$

F(P,k) can be approximated by expanding $F(P,k\,|\,g)$ in a
Taylor series* at g = 0 and taking expectations.

Since $F(P,k\,|\,g)$ is an even function of g, its odd
derivatives are zero, and the Taylor expansion about g = 0 is

$$F(P,k\,|\,g) = F(P,k\,|\,0) + \frac{g^2}{2!}\frac{\partial^2 F}{\partial g^2} + \frac{g^4}{4!}\frac{\partial^4 F}{\partial g^4} + \cdots \qquad (3.2)$$

with all derivatives to be evaluated at g = 0.

*Wald and Wolfowitz show the validity of the Taylor expansion.

Taking expectations,

$$F(P,k) = E\,F(P,k\,|\,g) = F(P,k\,|\,0) + \frac{1}{2n'}\frac{\partial^2 F}{\partial g^2} + \frac{1}{8n'^2}\frac{\partial^4 F}{\partial g^4} + \cdots \qquad (3.3)$$

since the second and fourth moments of g, which is normally
distributed with mean 0 and variance 1/n', are 1/n' and
$3/n'^2$, respectively.

On comparing the right hand sides of (3.2) and (3.3)
one sees that (3.2) will become identical with (3.3), except
for terms involving the second and higher even powers of
1/n', if one sets $g = \sqrt{1/n'}$. Thus

$$F(P,k\,|\,\sqrt{1/n'}) \approx F(P,k).$$

This means that in order to obtain F(P,k) one has
to evaluate $F(P,k\,|\,\sqrt{1/n'})$. There is a unique value of
r such that

$$\frac{1}{\sqrt{2\pi}} \int_{1/\sqrt{n'}-r}^{1/\sqrt{n'}+r} e^{-Z^2/2}\,dZ = P$$

since the left side is a monotonic increasing function of r.
The r corresponds with the half length ks of an interval
centered at $1/\sqrt{n'}$ for which A = P.
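
Because the integral is monotonic increasing in r, the defining equation can be solved numerically by simple bisection. The sketch below is one possible approach, not the tabulation method used in the references; the function name and the tolerance are illustrative assumptions.

```python
import math
from statistics import NormalDist


def solve_r(n_prime, p, tol=1e-12):
    """Half length r with Phi(d + r) - Phi(d - r) = p, where d = 1/sqrt(n')."""
    std_normal = NormalDist()
    d = 1.0 / math.sqrt(n_prime)

    def content(r):
        # Standard normal probability of the interval centered at d with half length r.
        return std_normal.cdf(d + r) - std_normal.cdf(d - r)

    lo, hi = 0.0, 1.0
    while content(hi) < p:      # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:        # bisection on the monotonic function content(r)
        mid = (lo + hi) / 2.0
        if content(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


r = solve_r(n_prime=10, p=0.95)
```

For n' = 10 and P = 0.95 this gives r of about 2.054, somewhat larger than the normal deviate 1.96 because the interval is centered away from the population mean.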

The problem is to select k large enough, in the
light of the sampling distribution of s, to make the
probability $\gamma$ that ks will be at least r. Thus,

$$F(P,k\,|\,1/\sqrt{n'}) = \Pr(s > r/k) = \Pr(\chi^2 > f r^2/k^2) = \gamma$$

since $\chi^2 = f s^2/\sigma^2$ and here $\sigma = 1$. This
probability can be evaluated from tables of the chi-square
distribution, after first finding r from tables of the normal
distribution using a trial and error method or Newton's
method (19).

After P and $\gamma$ are given, one solves for k in

$$\chi^2_{1-\gamma,\,f} = f r^2/k^2,$$

where $\chi^2_{1-\gamma,\,f}$ is that number for which

$$\Pr[\chi^2_f > \chi^2_{1-\gamma,\,f}] = \gamma;$$

then k = ru where $u = \sqrt{f/\chi^2_{1-\gamma,\,f}}$.

The interpretation of these limits is as follows.
When many random samples of the same size are taken from the
normal population and a ($\gamma$,P)TL is calculated each time,
then in $100\gamma\%$ of the cases these limits will include at
least 100P% of the population.

The following procedure (Procedure B) may be used
to compute ($\gamma$,P)TL for any variate for which there is a
normally distributed estimate of the mean with variance
$\sigma^2/n'$ and an estimate of the variance independently
distributed as $\sigma^2\chi^2/f$ with f df:


Procedure B

1. Obtain an estimate g of the population mean
   (e.g. $g = \bar{Y}$, $g = \bar{Y}_1 - \bar{Y}_2$)

2. Obtain var(g) and write it in the form $\sigma^2/n'$
   (e.g. $\mathrm{var}(\bar{Y}) = \left(\frac{1}{n}\right)\sigma^2$,
   $\mathrm{var}(\bar{Y}_1 - \bar{Y}_2) = \left(\frac{1}{n_1} + \frac{1}{n_2}\right)\sigma^2$ *)

3. Obtain an unbiased estimate of $\sigma^2$ (usually called
   $s^2$, with f df)

4. Decide on reasonable values of $\gamma$ and P

5. Compute r:

   $$r = Z_{(1+P)/2}\left[1 + \frac{1}{2n'} - \frac{2Z^2_{(1+P)/2} - 3}{24\,n'^2}\right]$$

   from Bowker (2), where $Z_{(1+P)/2}$ is the (1+P)/2
   percentage point of the standard normal distribution

6. Compute u:

   $u$** $= \sqrt{f/\chi^2_{1-\gamma,\,f}}$ where
   $\chi^2_{1-\gamma,\,f}$ is that percentile of the
   $\chi^2$-distribution with f df which will be exceeded by
   chance $100\gamma\%$ of the time.

7. Compute k = ru

8. ($\gamma$,P)TL $= g \pm k\sqrt{s^2}$

*Assuming that both populations have a common variance $\sigma^2$.

**Dixon and Massey (6) give $\sqrt{F_{1-\gamma,\,\infty,\,n-2}}$ in
place of u. However the $F_{1-\gamma,\,\infty,\,n-2}$ should read
$F_{\gamma,\,\infty,\,n-2}$ for the appropriate value from their
table of percentiles of the F distributions. The n-2 is
associated with the degrees of freedom for error in their
regression procedure.

Step  8  would  be  modified  to  read  as  g  +  if 

the  experimenter  were  interested  in  (y,P)TL  for  future  means 
based  on  m  observations  each  (7). 
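
Steps 5 through 7 of Procedure B can be sketched in Python as follows. This is an illustration under stated assumptions: r uses the three-term Bowker approximation of Step 5, and the chi-square percentile is obtained from a small hand-written series and bisection routine (both helper functions are hypothetical names introduced here) because the standard library has no chi-square tables.

```python
import math
from statistics import NormalDist


def chi2_cdf(x, f):
    """Pr[chi-square with f df <= x], from the power series of the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = f / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    j = 1
    while term > 1e-16 * total:
        term *= z / (a + j)
        total += term
        j += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_point_exceeded_with_prob(gamma, f):
    """Value exceeded with probability gamma (lower-tail area 1 - gamma), by bisection."""
    target = 1.0 - gamma
    lo, hi = 0.0, float(f)
    while chi2_cdf(hi, f) < target:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, f) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def tolerance_factor(n_prime, f, gamma, p):
    """k = r * u for the tolerance limits with confidence probability."""
    z = NormalDist().inv_cdf((1 + p) / 2)
    r = z * (1 + 1 / (2 * n_prime) - (2 * z * z - 3) / (24 * n_prime ** 2))  # step 5
    u = math.sqrt(f / chi2_point_exceeded_with_prob(gamma, f))               # step 6
    return r * u                                                             # step 7


# Single sample of n = 10: n' = 10, f = 9, gamma = P = 0.95.
k = tolerance_factor(n_prime=10, f=9, gamma=0.95, p=0.95)
```

For this case k comes out near 3.38, so the limits are g -/+ 3.38 s, considerably wider than the 1.96 s that would apply if the parameters were known.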

Tabular values were obtained for r and u by Weissberg
and Beatty (19), and their values are also given in
Owen's Handbook of Statistical Tables (12). The tabulated
values for r were prepared for a sample of size n from a
single population and are given as r = r(n,P). One needs
to let n = n' when using these tables.

Bowker (2) has shown that for large n' the expression
$Z_{(1+P)/2}\,[\,1 + 1/2n'\,]$ may be used for r instead of
the expression given in Step 5.

Bowker (3) has tabulated values of k for the special
case where f = n-1.

Situations may arise where $\mu$ or $\sigma$ is known. In the
event that $\mu$ is known and $\sigma$ is unknown one can use
the above result as $k = Z_{(1+P)/2}\,u$ where $Z_{(1+P)/2}$ is
the (1+P)/2 percentile point of the standard normal
distribution. If $\sigma$ is known and $\mu$ is unknown then
the above result is used with $\infty$ degrees of freedom
(f = $\infty$). The u will become 1, and k = r, which depends
only on n' and P. Regardless of what level of $\gamma$ is
chosen, u is always equal to one in the case where $\sigma$ is
known.
IV. RELATIONSHIP BETWEEN THE VARIOUS LIMITS

A. Comparison of the Limits

Figure 2 gives an oversimplified comparison between
the confidence limits and the tolerance limits [(P)TL and
($\gamma$,P)TL] for different sample sizes. The "picture" was
drawn as simply as possible to illustrate the basic concepts,
but the following shortcomings should be realized:

1. At each sample size (except n = $\infty$), each interval is
   an estimate and is not necessarily symmetric about $\mu$.

2. At each sample size (except n = $\infty$), one should
   visualize many confidence interval estimates with
   $100\gamma\%$ of them covering $\mu$, many (P)TL estimates
   whose average interval covers 100P% of the population, and
   many ($\gamma$,P)TL with $100\gamma\%$ of these intervals
   covering at least 100P%.

3. When $\sigma$ is not known, all estimates mentioned in 2
   (above) will usually be of unequal length.

The (P)TL gives an estimate of the interval $\mu \pm k\sigma$
in the same manner as $\bar{Y}$ gives an estimate of the point
$\mu$.

The ($\gamma$,P)TL are in nature comparable to the confidence
limits because these tolerance limits give a "confidence
interval" about an interval (including at least 100P% of the
population), while the confidence limits give a confidence
interval about a point.
Figure 2. Oversimplified Comparison Between Confidence Limits,
(P)TL, and ($\gamma$,P)TL on a Simple Mean for Different
Sample Sizes.

For a very large sample the confidence limits converge
to one point, the parameter (see Figure 2). This can
easily be verified from the previous formulas. As sample
size and degrees of freedom increase for the normal
distribution the ($\gamma$,P)TL and the (P)TL approach
essentially two limiting parameters with 100% confidence
including the proportion P of the population.

B. Similarity Between Confidence Limits and Tolerance Limits
[(P)TL]


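The following is based on Proschan's article.
Frequently, experimenters are interested in finding a
prediction (or "confidence") interval for an additional
observation from the same population. Most standard
statistical texts (16) show that

$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{\left[\sum Y_1^2 - (\sum Y_1)^2/n_1\right] + \left[\sum Y_2^2 - (\sum Y_2)^2/n_2\right]}{n_1+n_2-2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

is distributed as Student's t with f = $n_1+n_2-2$. One may
now use this relationship to find the following prediction
interval for the value of one additional observation $Y_2$
($n_2 = 1$):

$$\Pr\left[-t_{(1+\gamma)/2,\,n_1-1} < \frac{\bar{Y}_1 - Y_2}{s_1\sqrt{(n_1+1)/n_1}} < t_{(1+\gamma)/2,\,n_1-1}\right] = \gamma$$

or

$$\Pr\left[\bar{Y}_1 - t_{(1+\gamma)/2,\,n_1-1}\sqrt{\frac{n_1+1}{n_1}}\; s_1 < Y_2 < \bar{Y}_1 + t_{(1+\gamma)/2,\,n_1-1}\sqrt{\frac{n_1+1}{n_1}}\; s_1\right] = \gamma \qquad (4.1)$$

where

$$s_1^2 = \frac{\sum Y_1^2 - (\sum Y_1)^2/n_1}{n_1 - 1}.$$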

This simply means that if pairs of samples of size $n_1$
and 1 for $Y_1$ and $Y_2$, respectively, are drawn repeatedly,
then $100\gamma\%$ of the $Y_2$'s will lie in the above
interval. It does not mean that if one sample of size $n_1$
($Y_1$) were drawn, to be followed by the drawing of many
additional $Y_2$'s, that $100\gamma\%$ of these $Y_2$'s will
lie in the interval.
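
The "pairs of samples" reading of the prediction interval can be checked by simulation. In the sketch below each trial draws a fresh sample of size n1 together with one additional observation from a standard normal population (an arbitrary choice); the percentile 2.262 is the usual tabled t value for 0.975 and 9 df, and the function name and seed are illustrative.

```python
import math
import random
from statistics import mean, stdev


def prediction_hit_rate(n1, t_crit, n_pairs, seed=3):
    """Fraction of (sample, new observation) pairs in which the new
    observation falls inside the prediction interval built from the sample."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        ys = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        y2 = rng.gauss(0.0, 1.0)                      # the one additional observation
        half = t_crit * math.sqrt((n1 + 1) / n1) * stdev(ys)
        if abs(y2 - mean(ys)) < half:
            hits += 1
    return hits / n_pairs


hit_rate = prediction_hit_rate(n1=10, t_crit=2.262, n_pairs=4000)
```

The hit rate is close to 0.95 across pairs. If instead one sample were held fixed and many new observations were drawn, the proportion falling inside would equal the content of that particular interval, which may be well above or below 0.95.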

Notice that the $100\gamma\%$ confidence limits for the
value of one additional observation (4.1) are the same as the
(P)TL (3.1) except for the subscript on t, remembering that
$t_{(1-P)/2,\,n-1} = -t_{(1+P)/2,\,n-1}$. How is this confidence
or prediction interval related to the (P)TL? An intuitive
explanation of their relationship may go as follows. The
$\bar{Y}_1 \pm t_{(1+\gamma)/2,\,n_1-1}\sqrt{(n_1+1)/n_1}\; s_1$
in (4.1) is an estimate of
$\mu \pm t_{(1+\gamma)/2,\,\infty}\sqrt{1}\,\sigma$, and
substituting, (4.1) would become

$$\Pr[\mu - t_{(1+\gamma)/2,\,\infty}\,\sigma < Y_2 < \mu + t_{(1+\gamma)/2,\,\infty}\,\sigma] = \gamma.$$

This interval is fixed and contains the central $100\gamma\%$
of the future $Y_2$'s from the population. Thus each (4.1) is
an estimate of an interval which contains $100\gamma\%$ of the
population. However, this is the definition of (P)TL in
Section III, replacing $\gamma$ with P. Hence, confidence
limits with confidence coefficient $\gamma$ for a second sample
of size one are identical with tolerance limits that will
include a proportion P on the average.

Paulson (13) proves the following simple lemma on
the relationship between confidence limits ($\gamma$) for a
future random observation and (P) tolerance limits: If
confidence limits $U_1(x_1, \ldots, x_n)$ and
$U_2(x_1, \ldots, x_n)$ on a probability level = $\gamma$ are
determined for g, a function of a future sample of k
observations, and

$$P = \int_{U_1}^{U_2} \psi(g)\,dg,$$

then E(P) = $\gamma$. Let $\psi(g)\,dg$ and
$\phi(U_1,U_2)\,dU_1\,dU_2$ denote the distribution of g and of
$U_1$, $U_2$, respectively; then by the definition of expected
value

$$E(P) = \int\!\!\int \left[\int_{U_1}^{U_2} \psi(g)\,dg\right] \phi(U_1,U_2)\,dU_1\,dU_2.$$

This triple integral is, however, exactly the probability that
g will lie between $U_1$ and $U_2$, which by the nature of
confidence limits must equal $\gamma$, which proves the lemma.


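Following the procedure of computing confidence limits for the next observation, one can quite easily compute (P)TL for any variate for which there is a normally distributed estimate of the mean with variance $\sigma^2/n'$ and the estimate of the variance is independently distributed as $\sigma^2\chi^2/f$ with f df. For example, the (P)TL for $Y_1 - Y_2$ when given $n_1$ observations from the $Y_1$ population and $n_2$ observations from the $Y_2$ population is obtained from

$$\Pr\left[t_{(1-P)/2} \;\le\; \frac{(Y_1 - Y_2) - (\bar{Y}_1 - \bar{Y}_2)}{\sqrt{s^2\left(1 + 1 + \frac{1}{n_1} + \frac{1}{n_2}\right)}} \;\le\; t_{(1+P)/2}\right] = P$$

where $s^2$ is the pooled sample variance. This expression is then rearranged as follows:

$$\Pr\left[(\bar{Y}_1 - \bar{Y}_2) + t_{(1-P)/2}\sqrt{s^2\left(1 + 1 + \tfrac{1}{n_1} + \tfrac{1}{n_2}\right)} \;\le\; Y_1 - Y_2 \;\le\; (\bar{Y}_1 - \bar{Y}_2) + t_{(1+P)/2}\sqrt{s^2\left(1 + 1 + \tfrac{1}{n_1} + \tfrac{1}{n_2}\right)}\right] = P$$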

A summary of the computing procedures for the two-sided confidence limits and both types of tolerance limits on normal populations is given in Table 1.
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TABLE  1.  COMPUTATIONAL  PROCEDURES  OF  CONFIDENCE  LIMITS, 
(P)TL, AND (γ,P)TL FOR NORMAL POPULATIONS

(U = unknown, K = known; " = same as the entry above; - = step not needed)

Source             Parameters    Step #1                   Step #2
-----------------  ------------  ------------------------  ------------------------
Confidence limits  φ U, σ² U     Obtain estimate φ̂ of φ    Obtain var(φ̂) = σ²/n'
        "          φ U, σ² K     "                         "
(P)TL              φ U, σ² U     "                         Var. φ̂ + var. of Y
        "          φ U, σ² K     "                         "
        "          φ K, σ² U     -                         "
        "          φ K, σ² K     -                         "
(Table 1 continued.)

Source             Parameters    Step #3                             Step #4
-----------------  ------------  ----------------------------------  -------------------
Confidence limits  φ U, σ² U     Obtain estimate of σ² (called s²)   Decide on γ
        "          φ U, σ² K     -                                   "
(P)TL              φ U, σ² U     Obtain estimate of σ² (called s²)   Decide on P
        "          φ U, σ² K     -                                   "
        "          φ K, σ² U     Obtain estimate of σ² (called s²)   "
        "          φ K, σ² K     -                                   "
(γ,P)TL            φ U, σ² U     Obtain estimate of σ² (called s²)   Decide on γ and P
        "          φ U, σ² K     -                                   Decide on P only
        "          φ K, σ² U     Obtain estimate of σ² (called s²)   Decide on γ and P
        "          φ K, σ² K     -                                   Decide on P only

(Table 1 continued.)

[Final step: the limit formula for each Source/Parameters combination listed above. The formulas are not legible in this copy.]

* $t_{\lambda,f}$ is the λ percentage point of Student's t distribution with f df.

** Formula as given is not always correct, depending on the φ under consideration. See page 34.

*** $\chi^2_{\gamma,f}$ is the percentage point of the $\chi^2$ distribution with f df which will be exceeded by chance 100γ% of the time. See page 39.
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V.  LIMITS  IN  SIMPLE  LINEAR  REGRESSION 


A.  Background 

In linear regression, Y values are obtained from several populations, each population being determined by a corresponding X value. The X variable is fixed or measured without error. The following assumptions are usually made about the "true" model:

1. The distribution of Y for each X is normal.

2. The mean values of Y lie exactly on the line $\mu_{Y\cdot X} = \alpha + \beta X$.

3. The variance of Y, $\sigma^2_{Y\cdot X}$, is the same for each X.

4. The Y observations are statistically independent.

The classical "least squares" procedure is used for "fitting" a line which best describes the linear relationship between the $(X_i, Y_i)$ pairs of observations. This procedure determines values of a and b which minimize

$$SSD = \sum_{i=1}^{n} (Y_i - a - bX_i)^2.$$

The  b  for  the  "fitted"  line  is  called  the  regression  co¬ 
efficient,  and  the  a  is  called  the  intercept.  The  line  is 
called  a  regression  line,  and  its  equation  is  called  a  re¬ 
gression  equation. 
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B.  Confidence  Limits 

1.  Non-simultaneous  confidence  limits 

Frequently textbooks give 100γ% confidence limits on the population mean of Y at a particular $X_0$ value, $\mu_{Y\cdot X_0}$. The concept of computing confidence limits on a single normal population is simply applied repeatedly to the Y data at the different values of X. The intervals are not independent of each other because they all depend on the same regression line. These intervals will be referred to as non-simultaneous confidence limits (intervals).

The interpretation for any one of these populations is that if many samples of the same size were drawn from the same population of Y's at $X_0$ and an interval were constructed for each sample, then one would expect 100γ% of these "random intervals" to cover the fixed point $\mu_{Y\cdot X_0}$.

Procedure A for the computation of confidence limits may be used repeatedly to compute 100γ% non-simultaneous confidence limits for different values of X (call the X under consideration $X_0$). The procedure is given below for simple linear regression problems and will be referred to as Procedure C.


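Procedure C

1. $\hat{Y} = a + bX_0$, where

$$b = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}} = \frac{Sxy}{Sx^2} \qquad \text{and} \qquad a = \bar{Y} - b\bar{X}$$

2. $\mathrm{Var}(\hat{Y}) = \sigma^2_{Y\cdot X}\left[\dfrac{1}{n} + \dfrac{(X_0 - \bar{X})^2}{Sx^2}\right]$

3. $s_{Y\cdot X} = \sqrt{\dfrac{Sy^2 - (Sxy)^2/Sx^2}{n-2}}$, where $Sy^2 = \sum Y^2 - (\sum Y)^2/n$

4. Confidence limits for $\mu_{Y\cdot X_0}$: $\hat{Y} \pm t_{(1+\gamma)/2,\,f}\,\sqrt{\dfrac{1}{n} + \dfrac{(X_0 - \bar{X})^2}{Sx^2}}\;s_{Y\cdot X}$, with f = n-2.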

If each confidence limit is considered a function of X, then the limits define the two branches of a hyperbola with the fitted line as the diameter. The interval has minimum length for $X = \bar{X}$, and its length increases as $|X - \bar{X}|$ increases.
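Procedure C can be sketched in a few lines of code. The example below is illustrative only: the function name `procedure_c`, the five data pairs, and the tabled value $t_{.975,3} = 3.182$ are assumptions made for the sketch, not data from the report. The t value for f = n-2 df is supplied by the caller.

```python
import math


def procedure_c(xs, ys, x0, t_value):
    """Non-simultaneous confidence limits for the mean of Y at x0 (Procedure C).

    t_value is the tabled t_{(1+gamma)/2, n-2}, supplied by the caller.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sx2 = sum(x * x for x in xs) - sum_x ** 2 / n              # Sx^2
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sy2 = sum(y * y for y in ys) - sum_y ** 2 / n              # Sy^2
    b = sxy / sx2                                              # step 1: slope
    a = sum_y / n - b * sum_x / n                              # step 1: intercept
    y_hat = a + b * x0
    d = 1 / n + (x0 - sum_x / n) ** 2 / sx2                    # step 2: variance factor
    s_yx = math.sqrt((sy2 - sxy ** 2 / sx2) / (n - 2))         # step 3
    half = t_value * math.sqrt(d) * s_yx                       # step 4
    return y_hat - half, y_hat + half


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
T_975_3DF = 3.182  # tabled t_{0.975, 3} (assumed table value), i.e. gamma = 0.95

for x0 in (1.0, 3.0, 5.0):
    low, high = procedure_c(xs, ys, x0, T_975_3DF)
    print(f"X0 = {x0}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

Evaluating the limits at several $X_0$ values shows the hyperbolic shape: the interval is narrowest at the mean of the X's and widens toward either end of the range.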


* All $\sum = \sum_{i=1}^{n}$.
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2.  Simultaneous  confidence  limits 


As  mentioned  before,  repeated  use  of  the  non-simul- 
taneous  confidence  limits  would  result  in  error  because  of 
the  lack  of  independence  of  the  intervals.  In  1929,  Working 
and  Hotelling  (22)  worked  out  a  procedure  whereby  they  found 
a  confidence  region  for  an  entire  regression  line.  They 
computed  a  confidence  region,  not  an  interval,  which  covered 
the  whole  line,  not  only  one  point  on  the  line.  This  pro¬ 
cedure  later  turned  out  to  be  a  special  case  of  Scheffe's 
simultaneous  confidence  intervals  (15).  Wilks  (21)  gives  a 
proof  of  Scheffe's  method  for  simultaneous  confidence  in¬ 
tervals  in  his  text,  and  it  is  his  proof  that  is  given  in 
this  paper. 

The basic result due to Scheffé is as follows:

Suppose $u' = (u_1,\ldots,u_k)$ is a k-dimensional random variable having normal distribution

$$N(\mu, A\sigma^2)$$

where $\mu' = (\mu_1,\mu_2,\ldots,\mu_k)$ is the vector of the means and A is the variance-covariance matrix (non-singular) with elements $a_{ij}$, and $\sigma^2$ is unknown. Let S = residual sum of squares; then $S/\sigma^2$ is a random variable independent of $(u_1,\ldots,u_k)$ which follows the chi-square distribution with f df. Let $F_{\gamma,k,f}$ be the 100γ% point of the F-distribution and let $\delta = \sqrt{(S/f)\,k\,F_{\gamma,k,f}}$. We can then state the following theorem: If Θ is the set of all real vectors $(c_1,\ldots,c_k)$ where $c_1,\ldots,c_k$ are not all zero, the inequalities

$$\sum_i c_i u_i - \delta\sqrt{\sum_{i,j} a_{ij} c_i c_j} \;\le\; \sum_i c_i \mu_i \;\le\; \sum_i c_i u_i + \delta\sqrt{\sum_{i,j} a_{ij} c_i c_j} \qquad (5.1)$$

hold simultaneously with probability γ for all $(c_1,\ldots,c_k)$ in Θ.

To prove the theorem one should first note that $(u-\mu)'A^{-1}(u-\mu)/\sigma^2 = (1/\sigma^2)\sum_{i,j} a^{ij}(u_i-\mu_i)(u_j-\mu_j)$ and $S/\sigma^2$ are independent random variables having chi-square distributions with k and f df, respectively, with $a^{ij}$ being the elements of $A^{-1}$. Hence $(f/kS)\sum_{i,j} a^{ij}(u_i-\mu_i)(u_j-\mu_j)$ has the F-distribution with k and f df. Therefore

$$\Pr\left[\sum_{i,j} a^{ij}(u_i-\mu_i)(u_j-\mu_j) \le \delta^2\right] = \gamma$$

where $\delta^2 = (kS/f)\,F_{\gamma,k,f}$.

Next Wilks makes use of k-dimensional geometric concepts and terminology. The set of points in the space of $(\mu_1,\ldots,\mu_k)$ for which

$$\sum_{i,j} a^{ij}(u_i-\mu_i)(u_j-\mu_j) \le \delta^2 \qquad (5.2)$$

is the interior of a 100γ% confidence ellipsoid for the true parameter point $(\mu_1,\ldots,\mu_k)$ centered at $(u_1,\ldots,u_k)$. If one considers the set of points in the space of $(\mu_1,\ldots,\mu_k)$ contained between all possible pairs of parallel (k-1)-dimensional hyperplanes tangent to this ellipsoid, then this set of points constitutes the interior of the ellipsoid (5.2) and the probability associated with this set is γ.

Wilks then goes on to show that for any particular choice of $(c_1,\ldots,c_k)$ in Θ the two parallel (k-1)-dimensional hyperplanes in the space of $(\mu_1,\ldots,\mu_k)$ having equations

$$\sum_i c_i \mu_i = \sum_i c_i u_i \pm \delta\sqrt{\sum_{i,j} a_{ij} c_i c_j} \qquad (5.3)$$

are tangent to the ellipsoid

$$\sum_{i,j} a^{ij}(\mu_i-u_i)(\mu_j-u_j) = \delta^2. \qquad (5.4)$$

Any point $(\mu_1,\ldots,\mu_k)$ between the two hyperplanes (5.3) satisfies (5.1). For the moment let $\mu_i - u_i = y_i$. Then (5.4) can be written as

$$\sum_{i,j} a^{ij} y_i y_j = \delta^2, \qquad (5.5)$$

and the equation of an arbitrary hyperplane in the space of $(y_1,\ldots,y_k)$ can be written as

$$\sum_i c_i y_i = d. \qquad (5.6)$$

Now one must find the two values of d for which the hyperplane (5.6) is tangent to the ellipsoid (5.5). Using a Lagrange multiplier λ, one must find the stationary points in the $(y_1,\ldots,y_k)$-space of

$$\phi = \tfrac{1}{2}\lambda\Big(\delta^2 - \sum_{i,j} a^{ij} y_i y_j\Big) + \sum_i c_i y_i.$$

Differentiating with respect to $y_i$ one finds

$$-\lambda \sum_j a^{ij} y_j + c_i = 0 \quad \text{or} \quad y_i = (1/\lambda)\sum_j a_{ij} c_j. \qquad (5.7)$$

Substituting in (5.4) one finds

$$\lambda = \pm(1/\delta)\sqrt{\sum_{i,j} a_{ij} c_i c_j}. \qquad (5.8)$$

From (5.8), (5.7), and (5.6) one finds

$$d = \pm\,\delta\sqrt{\sum_{i,j} a_{ij} c_i c_j}.$$

Substituting this value of d in (5.6) and using the fact that $y_i = \mu_i - u_i$, one obtains (5.3) as the equations of the two parallel tangent hyperplanes for specified $(c_1,\ldots,c_k)$. This implies (5.1) and hence proves the theorem.
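The tangent-hyperplane result can be checked numerically for k = 2. The sketch below is an illustration, not part of Wilks's proof: the matrix entries, the vector c, and δ are arbitrary assumed values. It walks around the ellipse $y'A^{-1}y = \delta^2$, records the largest value of $c_1y_1 + c_2y_2$, and compares it with the closed form $\delta\sqrt{\sum a_{ij}c_ic_j}$.

```python
import math


def max_linear_form_on_ellipse(a11, a12, a22, c1, c2, delta, steps=20000):
    """Largest c1*y1 + c2*y2 over the ellipse y' A^{-1} y = delta^2 (grid search).

    Points on the ellipse are generated as y = delta * L * (cos t, sin t),
    where L is the Cholesky factor of A, so that y' A^{-1} y = delta^2.
    """
    l11 = math.sqrt(a11)
    l21 = a12 / l11
    l22 = math.sqrt(a22 - l21 ** 2)
    best = -math.inf
    for i in range(steps):
        t = 2 * math.pi * i / steps
        y1 = delta * l11 * math.cos(t)
        y2 = delta * (l21 * math.cos(t) + l22 * math.sin(t))
        best = max(best, c1 * y1 + c2 * y2)
    return best


# Arbitrary illustrative values (assumptions, not from the report).
a11, a12, a22 = 2.0, 0.5, 1.0
c1, c2 = 1.0, -3.0
delta = 1.5

numeric_d = max_linear_form_on_ellipse(a11, a12, a22, c1, c2, delta)
closed_form_d = delta * math.sqrt(a11 * c1 * c1 + 2 * a12 * c1 * c2 + a22 * c2 * c2)
print(f"grid search: {numeric_d:.6f}, closed form: {closed_form_d:.6f}")
```

The two numbers agree to the resolution of the grid, which is the content of the value of d obtained from (5.6), (5.7), and (5.8).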

In this paper one applies Scheffé's method (S-Method) of multiple comparison, as stated in the preceding theorem, to the family $[\alpha + \beta(X-\bar{X})]$, corresponding to the two-dimensional space $[c_1\alpha + c_2\beta]$, i.e. $c_1 = 1$ and $c_2 = X-\bar{X}$. With this procedure one can compute confidence limits for any number of different X values and say that all of the intervals simultaneously cover the corresponding $\mu_{Y\cdot X}$ values for 100γ% of such random confidence regions.

The results from the S-Method show that the same procedure, Procedure C on page 41, may be used to compute these simultaneous confidence limits as was used to compute the non-simultaneous confidence limits, with the following modification: In step 4, the quantity $\sqrt{2F_{\gamma,2,n-2}}$ is used instead of $t_{(1+\gamma)/2,\,n-2}$.

These simultaneous confidence limits also define the two branches of a hyperbola with the fitted line as the diameter. As might be expected, for a given γ level, the branches of the hyperbola for the simultaneous limits are farther apart than those for the non-simultaneous limits.
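How much farther apart the simultaneous limits lie depends only on the ratio of the two multipliers. The sketch below is an illustration under stated assumptions: it uses the closed form of the F percentage point that holds when the numerator has 2 df, $F_{\gamma,2,f} = (f/2)\left[(1-\gamma)^{-2/f} - 1\right]$, and a tabled value $t_{.975,13} = 2.160$; γ = .95 and f = 13 are chosen to match a sample of n = 15.

```python
import math


def f_quantile_2_numerator_df(prob, f):
    """F_{prob, 2, f}: closed form, valid only when the numerator has 2 df."""
    return (f / 2) * ((1 - prob) ** (-2 / f) - 1)


T_975_13DF = 2.160  # tabled t_{0.975, 13} (assumed table value)
gamma, f = 0.95, 13

non_simultaneous = T_975_13DF
simultaneous = math.sqrt(2 * f_quantile_2_numerator_df(gamma, f))
ratio = simultaneous / non_simultaneous

print(f"t multiplier:          {non_simultaneous:.3f}")
print(f"sqrt(2F) multiplier:   {simultaneous:.3f}")
print(f"ratio of band widths:  {ratio:.3f}")
```

Because both bands use the same $\sqrt{d}\,s_{Y\cdot X}$ factor, this ratio is the same at every X.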

C. Non-Simultaneous Tolerance Limits

1. Non-simultaneous (P)TL

Frequently, prediction intervals are also computed for simple linear regression problems (11). The practical use of the non-simultaneous (P)TL is rather restricted since the limits, like the non-simultaneous confidence limits, are not independent of each other. The same is true here as was for the confidence limits in that the concept of computing a (P)TL on a single normal population is applied repeatedly to the Y data at different values of X.

The procedure for computing non-simultaneous (P)TL is the same as Procedure C on page 41 for computing non-simultaneous confidence limits with the following modification: In step 2 of the procedure the variance of $\hat{Y}$ is

$$\sigma^2_{Y\cdot X}\left[1 + \frac{1}{n} + \frac{(X_0-\bar{X})^2}{Sx^2}\right]$$

which takes into consideration the variance associated with the additional observation.

These non-simultaneous (P)TL also define the two branches of a hyperbola with the fitted line as the diameter. With these limits one can rightfully say only that for one future $X_0$ value 100P% of the Y values will on the average lie within the given limits.

2. Non-simultaneous (γ,P)TL

As mentioned before, the (P)TL is simply an estimate of the interval and it does not give the experimenter any assurance of including at least a desired proportion of the population. The more desirable statement would include at least 100P% of the population with a predetermined level of confidence (γ). Whenever textbooks consider tolerance limits in simple regression, the non-simultaneous (γ,P)TL are most frequently mentioned (1), (6).

Procedure B on page 26 is used repeatedly for different X values to compute the non-simultaneous (γ,P)TL. Again, the loci of the tolerance limits may be plotted as a hyperbola with the fitted line as diameter. It must be re-emphasized that these limits are not independent of each other and hence do not hold for different values of X simultaneously. Generally, these limits are farther apart than the non-simultaneous (P)TL when using a reasonable 100γ% confidence level.

D. Simultaneous Tolerance Limits

1. Background

Lieberman (9) first considered the joint prediction interval for the response at each of K separate values of the independent variable when all K predictions must be based upon the original fitted model. He describes three methods, one exact and two approximate. For the exact method the probability is 100γ% that all K future observations fall within their respective intervals; for the approximate methods the probability is greater than 100γ%.

These prediction regions apply only to a specified number K of future responses at each of K separate X values. However, when K is unknown and possibly arbitrarily large these results are no longer valid. A solution to the problem of arbitrary K is given in terms of simultaneous tolerance limits (intervals) on the distribution of future observations. In this paper two types of simultaneous tolerance intervals will be considered: simultaneous (P)TL and simultaneous (γ,P)TL.


2. Simultaneous (P)TL

In an attempt to overcome the limitation of the non-simultaneous (P)TL on Y at a particular $X_0$, simultaneous (P)TL should perhaps be considered in simple linear regression. With these simultaneous (P)TL, one may say that on the average 100P% of the Y population values are included in each interval and that this statement may be made for any number of different X values simultaneously.

The computing procedure for these simultaneous (P)TL is analogous to the computation of simultaneous confidence limits. Thus Procedure C on page 41, the procedure for computation of non-simultaneous confidence limits, may be used to compute the simultaneous (P)TL with the following two modifications: In Step 2,

$$\mathrm{var}(\hat{Y}) = \sigma^2_{Y\cdot X}\left[1 + \frac{1}{n} + \frac{(X_0-\bar{X})^2}{Sx^2}\right]$$

and in Step 4, $\sqrt{2F_{P,2,n-2}}$ is used instead of $t_{(1+\gamma)/2,\,n-2}$.

As expected, for a given P and γ, the branches of the hyperbola for the simultaneous (P)TL are farther apart than those for the non-simultaneous (P)TL.

3. Simultaneous (γ,P)TL

Each of the previously mentioned tolerance limit procedures in simple linear regression had its limitation. However, one can see that the limits for each successive procedure were getting wider (unfortunately), but closer to what, in most cases, the experimenter is actually interested in. At least, each successive procedure was better than simply using non-simultaneous confidence limits and pretending that one had limits which included a given percentage of the population at some chosen level of confidence. Simultaneous (γ,P)TL appear to be the proper limits for most experimenters to use.

The approach used in the paper for the derivation of the simultaneous (γ,P)TL in regression is the simplest of four approaches presented by Lieberman and Miller (10). The authors made use of the Bonferroni inequality $P[AB] \ge 1 - P[A^c] - P[B^c]$, where $A^c$ and $B^c$ denote the complement of A and B, respectively. In this approach they employed the inequality to combine simultaneous confidence intervals on the regression means, as obtained by Scheffé, and the confidence interval for the standard deviation to construct a two-sided simultaneous (γ,P)TL. The two-sided confidence region for the regression line is obtained from

$$\Pr\left[\,\left|\alpha + \beta(X-\bar{X}) - a - b(X-\bar{X})\right| \le s_{Y\cdot X}\left(2F_{(1+\gamma)/2,\,2,\,n-2}\right)^{1/2}\left[\frac{1}{n} + \frac{(X-\bar{X})^2}{Sx^2}\right]^{1/2} \text{ for all } X\right] = (1+\gamma)/2. \qquad (5.9)$$


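An upper bound on σ is obtained from a one-sided chi-square confidence interval:

$$\Pr\left[\sigma \le s_{Y\cdot X}\left(\frac{n-2}{\chi^2_{(1-\gamma)/2,\,n-2}}\right)^{1/2}\right] = (1+\gamma)/2 \qquad (5.10)$$

where $\chi^2_{(1-\gamma)/2,\,n-2}$ is the (1-γ)/2 percentage point of the chi-square distribution for n-2 df. With use of the Bonferroni inequality the confidence statements (5.9) and (5.10) are combined into a joint confidence statement with probability greater than or equal to γ as:

$$\Pr\left[\,\left|\alpha + \beta(X-\bar{X}) \pm z_{(1+P)/2}\,\sigma - a - b(X-\bar{X})\right| \le s_{Y\cdot X}\left\{\left(2F_{(1+\gamma)/2,\,2,\,n-2}\right)^{1/2}\left[\frac{1}{n} + \frac{(X-\bar{X})^2}{Sx^2}\right]^{1/2} + z_{(1+P)/2}\left(\frac{n-2}{\chi^2_{(1-\gamma)/2,\,n-2}}\right)^{1/2}\right\} \text{ for all } X, P\right] \ge \gamma$$

where $z_{(1+P)/2}$ is the (1+P)/2 percentage point of the standard normal distribution.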

Lieberman and Miller describe the simultaneous (γ,P)TL in simple regression as follows: "If for a single regression line $[\hat{Y} = a + b(X_0-\bar{X})]$ one asserts that the proportion of future observations falling within the given tolerance limits (for any X) is at least P, and similar statements are made repeatedly for different regression lines $[\hat{Y} = a + b(X_0-\bar{X})]$, then for 100γ% of the different regression lines the statements will be correct". One may reword Lieberman and Miller's quotation as follows in order to give an analogous statement for the (γ,P)TL in Section III: "If for a single mean, $\bar{Y}$, one asserts that the proportion of future observations falling within the given tolerance limits is at least P, and similar statements are made repeatedly for different estimates of the mean, then for 100γ% of the different estimates the statements will be correct."

The authors did not appear to have any strong preference for any one of their four procedures. They then go on to say, "The widths of these simultaneous limits (talking about the four procedures in general) vary from slightly larger to about twice as large as the non-simultaneous intervals. This gives a rough indication of the price the experimenter will have to pay, or should be paying, for simultaneity". Many experimenters may feel that these limits will be too large to be of any practical benefit. In these situations, depending on the nature of the data, the experimenter should settle for smaller P and/or smaller γ levels. Smaller or more desirable limits are not necessarily justified when obtained by a procedure which should not have been used or a procedure which gives less precise information.


The computation of the simultaneous (γ,P)TL of the form $\hat{Y} \pm k's_{Y\cdot X}$ in simple linear regression is given in Procedure D (fixed central proportion P for all X's):

Procedure D

1. $\hat{Y} = \bar{Y} + b(X_0-\bar{X})$

2. $\mathrm{var}(\hat{Y}) = \sigma^2_{Y\cdot X}\,(d)$, where $d = \dfrac{1}{n} + \dfrac{(X_0-\bar{X})^2}{Sx^2}$

3. $s_{Y\cdot X} = \sqrt{\dfrac{Sy^2 - (Sxy)^2/Sx^2}{n-2}}$

4. Decide on reasonable levels of P and γ.

5. $k' = \sqrt{2F_{(1+\gamma)/2,\,2,\,n-2}}\,\sqrt{d} + z_{(1+P)/2}\sqrt{(n-2)/\chi^2_{(1-\gamma)/2,\,n-2}}$

6. $\hat{Y} \pm k's_{Y\cdot X}$

7. Steps (1), (2), (5), and (6) should be repeated for several X values (covering the range of X's). The loci of the limits may be plotted as a hyperbola with the fitted line as diameter.
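Procedure D can be sketched as follows. This is a hedged illustration, not the authors' program. The summary statistics are those quoted for the numerical example in Section VI (n = 15, $\bar{X}$ = 1.3531, $\bar{Y}$ = 5219.3, $Sx^2$ = .011966, b = 17930, s = 130.5). The lower 2.5% point of chi-square with 13 df, 5.009, is an assumed table value. The F point uses the closed form that holds for 2 numerator df. The function names are illustrative.

```python
import math
from statistics import NormalDist


def f_quantile_2_numerator_df(prob, f):
    """F_{prob, 2, f}: closed form, valid only when the numerator has 2 df."""
    return (f / 2) * ((1 - prob) ** (-2 / f) - 1)


def k_prime(x0, n, x_bar, sx2, p, gamma, chi2_lower):
    """Step 5 of Procedure D; chi2_lower is the tabled chi^2_{(1-gamma)/2, n-2}."""
    d = 1 / n + (x0 - x_bar) ** 2 / sx2                       # step 2
    scheffe = math.sqrt(2 * f_quantile_2_numerator_df((1 + gamma) / 2, n - 2))
    z = NormalDist().inv_cdf((1 + p) / 2)
    return scheffe * math.sqrt(d) + z * math.sqrt((n - 2) / chi2_lower)


# Summary statistics quoted in the numerical example of Section VI.
n, x_bar, y_bar, sx2, b, s_yx = 15, 1.3531, 5219.3, 0.011966, 17930.0, 130.5
CHI2_025_13DF = 5.009  # lower 2.5% point of chi-square, 13 df (assumed table value)


def simultaneous_gamma_p_tl(x0, p=0.95, gamma=0.95):
    """Steps 1, 5 and 6 of Procedure D at a single X0."""
    y_hat = y_bar + b * (x0 - x_bar)                          # step 1
    k = k_prime(x0, n, x_bar, sx2, p, gamma, CHI2_025_13DF)   # step 5
    return y_hat - k * s_yx, y_hat + k * s_yx                 # step 6


for x0 in (1.30, 1.3531, 1.40):                               # step 7: repeat over X
    low, high = simultaneous_gamma_p_tl(x0)
    print(f"X0 = {x0}: {low:.1f} to {high:.1f}")
```

Only the $\sqrt{d}$ term changes with $X_0$; the second term of k' is a constant offset. This is why the band stays wide even at the mean of the X's.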


E. Regression Through the Origin

In some situations the relationship between Y and X is such that when X=0 also Y=0. Thus, one is interested in passing the regression line through the origin, and the required equation is of the type $\mu_{Y\cdot X} = \beta X$. As in the previous case, it is assumed that deviations from the regression line are normally distributed with a common variance. Of course, the parameter estimates for this model are not the same as for the previous model, $\mu_{Y\cdot X} = \alpha + \beta X$.

The same procedure (Procedure C) for the computation of non-simultaneous confidence limits may be applied to this model as was used for the previous model, using the different estimates:

1. $\hat{Y} = bX_0$, where $b = \dfrac{\sum X_iY_i}{\sum X_i^2}$

2. $\mathrm{Var}(\hat{Y}) = \sigma^2_{Y\cdot X}\left[\dfrac{X_0^2}{\sum X_i^2}\right]$

3. $s_{Y\cdot X} = \sqrt{\dfrac{\sum Y_i^2 - (\sum X_iY_i)^2/\sum X_i^2}{n-1}}$ with n-1 degrees of freedom (f)

4. Confidence limits for $\mu_{Y\cdot X_0}$: $\hat{Y} \pm t_{(1+\gamma)/2,\,f}\,\sqrt{\dfrac{X_0^2}{\sum X_i^2}}\;s_{Y\cdot X}$
For $X_0 = 0$ (the origin), the above procedure shows a confidence interval of width 0. Initially one may feel that this is incorrect. However, for this point there is no sampling variation; the regression equation was "forced" through this point. It is easy to see that these confidence intervals widen as $X_0$ moves away from the origin. This "fan" appearance of the confidence limits is unlike the hyperbolic confidence limits obtained for the previous model.

The remainder of the confidence and tolerance intervals can be computed for $\mu_{Y\cdot X} = \beta X$ using the basic quantities given in the procedure above.
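The "fan" shape follows directly from the four steps of the through-the-origin procedure. The sketch below is illustrative: the function name `origin_regression_limits`, the data, and the tabled value $t_{.975,4} = 2.776$ are assumptions made for the example, not values from the report.

```python
import math


def origin_regression_limits(xs, ys, x0, t_value):
    """Confidence limits for the mean of Y at x0 when the line is forced through 0.

    t_value is the tabled t_{(1+gamma)/2, n-1}, supplied by the caller.
    """
    n = len(xs)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_y2 = sum(y * y for y in ys)
    b = sum_xy / sum_x2                                        # step 1
    y_hat = b * x0
    s_yx = math.sqrt((sum_y2 - sum_xy ** 2 / sum_x2) / (n - 1))  # step 3, n-1 df
    half = t_value * math.sqrt(x0 ** 2 / sum_x2) * s_yx        # steps 2 and 4
    return y_hat - half, y_hat + half


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
T_975_4DF = 2.776  # tabled t_{0.975, 4} (assumed table value), i.e. gamma = 0.95

for x0 in (0.0, 2.0, 4.0):
    low, high = origin_regression_limits(xs, ys, x0, T_975_4DF)
    print(f"X0 = {x0}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

The half-width is proportional to $|X_0|$, so the interval collapses to a point at the origin and doubles in width whenever $X_0$ doubles.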
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VI.  NUMERICAL  EXAMPLE 


A summary of the computing formulas for the various confidence and tolerance limits in simple linear regression is given in Table 2. The values from the various distributions have all been given in terms of the F-distribution in this table.

A numerical example has been presented so that the reader can appreciate to a fuller extent the various computational procedures, and can graphically see the difference (if any) in the interval widths for the various procedures.

The example used in this paper is the same as the numerical example presented in Lieberman & Miller's paper using 15 hypothetical pairs of values on speed of a missile (Y) and orifice opening (X). The underlying relationship between these two variables is of the form

Expected speed (miles/hr) = α + β · orifice opening (inches).

The necessary quantities from the data for the desired computations were [as given in (10)]:

$\bar{X} = 1.3531$

$\bar{Y} = 5219.3$

$Sx^2 = \sum(X-\bar{X})^2 = .011966$

$\hat{Y} = -19{,}041.9 + 17930X$

s = 130.5 with f = 13

n = 15
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TABLE  2.  COMPUTATIONAL  PROCEDURES  FOR  VARIOUS  TYPES  OF 

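CONFIDENCE AND TOLERANCE LIMITS IN SIMPLE LINEAR REGRESSION

Steps 1 to 3 (" = same as the entry above)

Source                                    Step 1                  Step 2                                                 Step 3
----------------------------------------  ----------------------  -----------------------------------------------------  ----------------------------------------------------
Non-simultaneous confidence limits        $\hat{Y} = a + bX_0$    $\sigma^2_{\hat{Y}} = \sigma^2_{Y\cdot X}\,(d)$        $s_{Y\cdot X} = \sqrt{[Sy^2 - (Sxy)^2/Sx^2]/(n-2)}$
  (Procedure C)
Simultaneous confidence limits            "                       "                                                      "
Non-simultaneous (P)TL                    "                       $\sigma^2_{Y\cdot X}\,(1+d)$                           "
Simultaneous (P)TL                        "                       "                                                      "
Non-simultaneous (γ,P)TL (Procedure B)    "                       $\sigma^2_{Y\cdot X}\,(d)$                             "
Simultaneous (γ,P)TL (Procedure D)        "                       "                                                      "

Notes:

$d = \dfrac{1}{n} + \dfrac{(X_0-\bar{X})^2}{Sx^2}$

$a = \bar{Y} - b\bar{X}$

$b = \dfrac{\sum XY - (\sum X)(\sum Y)/n}{\sum X^2 - (\sum X)^2/n}$

$Sy^2 = \sum Y^2 - (\sum Y)^2/n$

Limits

Source                                    Limits
----------------------------------------  ---------------------------------------------------------------
Non-simultaneous confidence limits        $\hat{Y} \pm \sqrt{F^*_{\gamma,1,n-2}}\,\sqrt{d}\;s_{Y\cdot X}$
  (Procedure C)
Simultaneous confidence limits            $\hat{Y} \pm \sqrt{2F_{\gamma,2,n-2}}\,\sqrt{d}\;s_{Y\cdot X}$
Non-simultaneous (P)TL                    $\hat{Y} \pm \sqrt{F_{P,1,n-2}}\,\sqrt{1+d}\;s_{Y\cdot X}$
Simultaneous (P)TL                        $\hat{Y} \pm \sqrt{2F_{P,2,n-2}}\,\sqrt{1+d}\;s_{Y\cdot X}$
Non-simultaneous (γ,P)TL (Procedure B)    $\hat{Y} \pm k\,s_{Y\cdot X}$, with $k = \sqrt{F_{P,1,\ldots}\,\ldots}$ as given by Procedure B
Simultaneous (γ,P)TL (Procedure D)        $\hat{Y} \pm k'\,s_{Y\cdot X}$, with $k' = \sqrt{2F_{(1+\gamma)/2,\,2,\,n-2}}\,\sqrt{d} + \sqrt{F_{P,1,\infty}\,F_{(1+\gamma)/2,\,\infty,\,n-2}}$

* $F_{\lambda,\nu_1,\nu_2}$ is the λ percentage point of the F distribution with $\nu_1$ and $\nu_2$ degrees of freedom.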

It was decided that P = .95 and γ = .95 were reasonable values to use. Figure 3 shows a tolerance band for each of the six types of limits considered in regression when using P = .95, γ = .95 and n = 15. Generally all tolerance bands are wide and the price for simultaneity appears high. The cause of the wide limits is two-fold. One cause is that s (basic standard deviation) is perhaps larger than what one would observe under a carefully controlled situation. The second cause of the wide tolerance limits is that either the level of confidence (γ=.95) or the proportion of the population to be included (P=.95) or both were chosen too large with respect to only the 15 pairs of observations used in the sample. In other words, one should expect to pay a high price (large limits) if a sample size of 15 is to supply the basic information for perhaps hundreds of future predictions.

In order to explore the effect of sample size, it was decided to use the same data under the condition that they were based on 150 pairs of observations rather than only 15 (essentially 10 pairs of observations at each point). Figure 4 shows a band for each of the six types of limits using P = .95, γ = .95 and n = 150. From these data one sees a clear distinction between confidence and tolerance bands. The price of simultaneity has become less for both the confidence and the tolerance limits. The non-simultaneous (95%)TL do not differ much from the simultaneous (95%)TL. The same is true for the simultaneous and non-simultaneous (95%,95%)TL.

[Figure: Six Types of Limits for a Simple Linear Regression Problem Using γ=.95, P=.95 and N=150 (essentially 10 pairs/pt.)]

In order to see what role the chosen level of γ plays, it was decided to compute a tolerance band for each of the six types of limits when using P = .95, γ = .75 and n = 15. (See Figure 5.) All limits involving γ are about 80% as wide as the limits when using P=.95, γ=.95 and n=15. Of course, both (95%)TL are the same as in Figure 3.

Figure 6 shows the limits for a sample size of 150, P=.95 and γ=.75. Figures 4 and 6 (n=150 for both) are nearly identical. This shows that for a reasonably large sample size the chosen level of γ has very little influence on the width of the confidence or tolerance limits.

Many of the observations made from the sample problem could also be made by comparing the F-ratio values used in the computing formulas in Table 2.

VII. RELATED MATERIAL NOT COVERED IN THE PAPER


The material in this paper was limited to two-sided confidence and tolerance limits applied to simple means and simple linear regression lines. Other areas of major interest are:

1. One-sided confidence and tolerance limits.

2. Application of the limits to multiple (fixed X) linear regression problems.

3. Application of the limits to simple linear regression lines where X is measured with error.

4. The simplest of Lieberman & Miller's procedures on simultaneous "P% TL with γ%" was chosen for this paper. Further comparisons between the four procedures under a variety of conditions would be of interest.

5. What price, if any, does the investigator have to pay to be able to make tolerance statements at various values of X not necessarily at the same level P, but still have one over-all γ confidence level, compared to a fixed P level statement as given in this report with the same over-all γ level of confidence?

6. Inverse prediction intervals whereby an interval of X values is found with which the additional Y observation could be associated, and one is 100γ% confident that at least 100P% of these intervals will include the true associated $X_0$ value (population $X_0$).

7. Nonparametric confidence and tolerance limits.